DD-$\alpha$AMG on QPACE 3
Abstract
We describe our experience porting the Regensburg implementation of the DD-αAMG solver from QPACE 2 to QPACE 3. We first review how the code was ported from the first-generation Intel Xeon Phi processor (Knights Corner) to its successor (Knights Landing). We then describe the modifications to the communication library necessitated by the switch from InfiniBand to Omni-Path. Finally, we present the performance of the code on a single processor as well as its scaling on many nodes; in both cases the speedup factor is close to the theoretical expectation.
Similar resources
Lattice QCD Applications on QPACE
QPACE is a novel massively parallel architecture optimized for lattice QCD simulations. A single QPACE node is based on the IBM PowerXCell 8i processor. The nodes are interconnected by a custom 3-dimensional torus network implemented on an FPGA. The compute power of the processor is provided by 8 Synergistic Processing Units. Making efficient use of these accelerator cores in scientific applica...
Lattice Boltzmann fluid-dynamics on the QPACE supercomputer
In this paper we present an implementation of a Lattice Boltzmann model of two-dimensional fluid flow for the QPACE supercomputer. QPACE is a massively parallel, application-driven system powered by the Cell processor. We review the structure of the model, describe its implementation on QPACE in detail, and finally present performance data and preliminary physics results.
Status of the QPACE Project
We give an overview of the QPACE project, which is pursuing the development of a massively parallel, scalable supercomputer for LQCD. The machine is a three-dimensional torus of identical processing nodes, based on the PowerXCell 8i processor. The nodes are connected by an FPGA-based, application-optimized network processor attached to the PowerXCell 8i processor. We present a performance analy...